The impact of genetic structure on sequencing analysis

Authors

  • Sneha Jadhav
  • Olga A. Vsevolozhskaya
  • Xiaoran Tong
  • Qing Lu
Abstract

BACKGROUND Genome-wide association studies have made substantial progress in identifying common variants associated with human diseases. Despite such success, a large portion of heritability remains unexplained. Evolutionary theory and empirical studies suggest that rare mutations could play an important role in human diseases, which motivates comprehensive investigation of rare variants in sequencing studies. To explore the association of rare variants with human diseases, many statistical approaches have been developed with different ways of modeling genetic structure (ie, linkage disequilibrium). Nevertheless, the appropriate strategy to model genetic structure of sequencing data and its effect on association analysis have not been well studied.

METHODS We investigate 3 statistical approaches that use 3 different strategies to model the genetic structure of sequencing data. We proceed by comparing a burden test that assumes independence among sequencing variants, a burden test that considers pairwise linkage disequilibrium (LD), and a functional analysis of variance (FANOVA) test that models genetic data through fitting continuous curves on individuals' genotypes.

RESULTS Through simulations, we find that FANOVA attains better or comparable performance to the 2 burden tests. Overall, the burden test that considers pairwise LD has comparable performance to the burden test that assumes independence between sequencing variants. However, for 1 gene, where the disease-associated variant is located in an LD block, we find that considering pairwise LD could improve the test's performance.

CONCLUSIONS The structure of sequencing variants is complex in nature and its patterns vary across the whole genome. In certain cases (eg, a disease-susceptibility variant is in an LD block), ignoring the genetic structure in the association analysis could result in suboptimal performance. Through this study, we show that a functional-based method is promising for modeling the underlying genetic structure of sequencing data, which could lead to better performance.
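As a rough illustration of the simplest of the three strategies, the sketch below shows a burden test that treats variants as independent: each individual's rare-variant genotypes are collapsed into one unweighted score, and the case/control difference in mean score is assessed with a permutation test. This is not the authors' implementation. The unweighted sum, the permutation scheme, the function names, and the toy data are all assumptions made for illustration; the burden tests in the study may use different weights and test statistics.

```python
import random
from statistics import mean


def burden_scores(genotypes):
    """Collapse each individual's genotypes (0/1/2 minor-allele counts)
    into one unweighted burden score by summing across variants."""
    return [sum(row) for row in genotypes]


def permutation_burden_test(genotypes, phenotype, n_perm=2000, seed=0):
    """Return the observed case-minus-control difference in mean burden
    and a two-sided permutation p-value for it."""
    scores = burden_scores(genotypes)

    def mean_diff(labels):
        cases = [s for s, y in zip(scores, labels) if y == 1]
        controls = [s for s, y in zip(scores, labels) if y == 0]
        return mean(cases) - mean(controls)

    observed = mean_diff(phenotype)
    rng = random.Random(seed)
    labels = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        # Shuffling labels breaks any genotype-phenotype association.
        rng.shuffle(labels)
        if abs(mean_diff(labels)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 6 individuals (rows) by 4 rare variants (columns).
toy_genotypes = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
toy_phenotype = [1, 1, 1, 0, 0, 0]  # 1 = case, 0 = control

observed_diff, p_value = permutation_burden_test(toy_genotypes, toy_phenotype)
print(observed_diff, p_value)
```

Because the score simply adds allele counts, correlated variants inside an LD block are counted as if each carried independent information, which is the assumption the LD-aware burden test and FANOVA are meant to relax.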

Similar resources

Gene-Gene Interaction Study Between Genetic Polymorphisms of Folate Metabolism and MTR SNPs on Prognostic Features Impact for Breast Cancer

Background: Breast cancer (BC) is the second leading cause of cancer mortality after lung cancer and varies across the world due to genetic and environmental factors. In this study, we evaluated the interaction between the polymorphisms in genes encoding enzymes of folate metabolism: methylenetetrahydrofolate reductase (MTHFR), methionine synthesis reductase (MTR) with the BC prognostic factors. ...

Impact of Genetic Variants in Mir-122 Gene and its Flanking Regions on Hepatitis B Risk

MicroRNAs are small noncoding RNAs that are involved in gene expression regulation. Mir-122 was reported to inhibit hepatitis B virus (HBV), but little is known about the role of mir-122 polymorphisms in the development of HBV infection. The present study aimed to investigate the association of single nucleotide polymorphisms (SNPs) in the mir-122 gene region with HBV infection. Study cases were HB...

Molecular Identification of Rare Clinical Mycobacteria by Application of 16S-23S Spacer Region Sequencing

Objective(s) In addition to several molecular methods and in particular 16S rDNA analysis, the application of a more discriminatory genetic marker, i.e., 16S-23S internal transcribed spacer gene sequence has had a great impact on identification and classification of mycobacteria. In the current study we aimed to apply this sequencing power to conclusive identification of some Iranian clinical ...

Population genetic studies of Liza aurata using D-Loop sequencing in the southeast and southwest coasts of the Caspian Sea

Genetic diversity as an important marker of the ecological status of aquatic ecosystems is considered a unique and powerful tool to evaluate biological communities. In order to evaluate the genetic diversity among golden mullet species (Liza aurata) in the southeast and southwest coasts of the Caspian Sea by D-Loop gene sequencing, a total of 23 fin specimens of golden mullet were collected fro...

Population structure and variation in Persian sturgeon (Acipenser persicus) from the Caspian Sea as determined from mitochondrial DNA sequences of the control region

Mitochondrial DNA (mtDNA) control region sequences were analyzed to evaluate the population genetic structure of Persian sturgeon (Acipenser persicus) in the Caspian Sea. A total of 45 specimens were collected from different locations in the Caspian Sea. The mtDNA control region was amplified using PCR. Direct sequencing was performed according to the standard method. The results showed that 12 haplotypes...

Journal title:

Volume 10, Issue

Pages -

Publication date: 2016